Enhanced Firefly-K-Means Clustering with Adaptive Mutation and Central Limit Theorem for Automatic Clustering of High-Dimensional Datasets
Authors
Abstract
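Metaheuristic algorithms have been hybridized with the standard K-means to address the latter's challenges in finding a solution to automatic clustering problems. However, the distance calculations required in the K-means phase of the hybrid algorithms increase as the number of clusters increases, and the associated computational cost rises in proportion to the dataset dimensionality. The use of the standard K-means algorithm in a metaheuristic-based hybrid for the automatic clustering of high-dimensional real-world datasets poses a great challenge to the performance of the resultant algorithm in terms of computational cost. Reducing the computation time of the K-means phase will inevitably reduce the algorithm's complexity. In this paper, a preprocessing phase is introduced into the K-means phase of an improved firefly-based K-means hybrid algorithm, using the concept of the central limit theorem (CLT) to partition the high-dimensional dataset into subgroups of randomly formed subsets on which the K-means algorithm is applied to obtain representative cluster centers for the final clustering procedure. The enhanced firefly algorithm (FA) is hybridized with the CLT-based K-means algorithm to automatically determine the optimum number of cluster centroids and generate the corresponding optimum initial cluster centroids for the K-means algorithm to achieve optimal global convergence. Twenty high-dimensional datasets from the UCI machine learning repository are used to investigate the performance of the proposed algorithm. The empirical results indicate that the hybrid FA-K-means clustering method demonstrates statistically significant superiority in the employed performance measures and in reducing the computation time for clustering high-dimensional dataset problems, compared with other advanced hybrid search variants.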
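The subset-based preprocessing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed number of clusters k, leaves out the firefly phase that selects k and the initial centroids, and uses plain Lloyd's K-means written with the Python standard library only. The names kmeans, subset_kmeans, and n_subsets are hypothetical, and the data is a synthetic toy set. The idea shown is that K-means runs on small random subsets, and the final clustering then operates on the pooled subset centers rather than on every data point.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on a list of equal-length tuples; returns k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: group each point with its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centers[j]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def subset_kmeans(points, k, n_subsets=5, seed=0):
    """Hypothetical helper: K-means on random subsets, then on the pooled centers."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    # Split the shuffled data into n_subsets roughly equal random subsets.
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    # Each subset contributes k representative centers.
    pooled = [c for s in subsets for c in kmeans(s, k, seed=seed)]
    # Final clustering runs on n_subsets * k centers instead of all points.
    return kmeans(pooled, k, seed=seed)


# Toy data: two well-separated 2-D blobs (illustrative only, not from the paper).
rng = random.Random(42)
data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(200)]
data += [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(200)]
centers = sorted(subset_kmeans(data, k=2))
```

In this sketch the final K-means call computes distances for only n_subsets * k points, which is where the saving in distance calculations would come from. How the paper sizes the subsets and combines the representative centers is not stated in the abstract, so those choices here are assumptions.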
Similar resources
Enhanced Clustering Based on K-means Clustering Algorithm and Proposed Genetic Algorithm with K-means Clustering
This paper targets a variety of techniques, approaches, and distinct research areas that are useful and regarded as crucial to the field of data mining technologies. The overall purpose of the data mining process is to extract useful information from a large data set and transform it into a comprehensible form for further use. Clustering i...
Adaptive K-Means Clustering
Clustering is used to organize data for efficient retrieval. One of the problems in clustering is the identification of clusters in given data. A popular technique for clustering is based on K-means such that the data is partitioned into K clusters. In this method, the number of clusters is predefined and the technique is highly dependent on the initial identification of elements that represent...
Automatic Scale Selection for Clustering of Correlated High-dimensional Datasets
Clustering algorithms usually have one or more parameters that control the scale at which the algorithm looks at the data. We study the problem of simultaneously selecting parameter values for multiple datasets (clustering instances), some of which are a priori known to have similar values. We propose two optimization problems related to this task. We show that one of them is NP-hard, and give ...
A hybridized K-means clustering approach for high dimensional dataset
Due to the incredible growth of high-dimensional datasets, conventional database querying methods are inadequate to extract useful information, so researchers nowadays are forced to develop new techniques to meet the raised requirements. Such large expression data gives rise to a number of new computational challenges not only due to the increase in number of data objects but also due to the increas...
Journal
Journal title: Applied Sciences
Year: 2022
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app122312275